Part II: Modern Practical Deep Networks

1. Introduction

Deep feedforward networks are also called feedforward neural networks or multilayer perceptrons (MLPs).

  • Goal: to approximate some function \(f^*(x)\).
  • Example: a classifier such as the Digit Recognizer. We are trying to map an input \(\boldsymbol{x}\) to a category \(y\). In the Digit Recognizer example, the input is an image of a handwritten digit, and we approximate the function \(f^*\) that maps this image to the correct digit (see the sketch below).
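To make the mapping concrete, here is a minimal sketch of such an \(f(x)\) in NumPy. The layer sizes, the random weights, and the input shape (a flattened 28×28 image) are illustrative assumptions, not a trained model:

```python
import numpy as np

# A minimal sketch of f(x) for digit classification.
# Shapes are illustrative: a flattened 28x28 image (784 pixels) in,
# 10 class scores out. The weights are random, standing in for
# parameters that training would learn.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.01, (128, 784)), np.zeros(128)
W2, b2 = rng.normal(0, 0.01, (10, 128)), np.zeros(10)

def f(x):
    h = np.maximum(0, W1 @ x + b1)   # hidden layer with ReLU
    return W2 @ h + b2               # one score per digit class

x = rng.random(784)                  # stand-in for a digit image
print(int(np.argmax(f(x))))          # predicted digit, 0-9
```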

Feedforward networks form the basis of many important commercial applications, and they are the foundation of architectures such as convolutional neural networks and recurrent neural networks.

2. Why Do We Call It a Network?

A feedforward network represents the composition of different functions. For example, we might have three functions \(f_1\), \(f_2\), and \(f_3\) connected in a chain to form \(f(x) = f_3(f_2(f_1(x)))\). These chain structures are the most commonly used structures of neural networks. In this case, \(f_1\) is called the first layer of the network, \(f_2\) is called the second layer, and so on.
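As a toy illustration of this chain structure, the sketch below composes three made-up scalar functions; the particular functions are arbitrary, chosen only to show layers as function composition:

```python
# Chain structure f(x) = f3(f2(f1(x))): each layer is just a function
# applied to the previous layer's output. Toy scalar functions for
# illustration only.
def f1(x): return 2 * x       # first layer
def f2(x): return x + 1      # second layer
def f3(x): return x ** 2     # third (output) layer

def f(x):
    return f3(f2(f1(x)))     # depth = number of composed functions

print(f(3))  # f1: 6 -> f2: 7 -> f3: 49
```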

The final layer of a feedforward network is called the output layer. During neural network training, we drive \(f(x)\) to match \(f^*(x)\).
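The following is a minimal sketch of this training process, assuming \(f^*(x) = x^2\) as the target function, a one-hidden-layer tanh network, squared error, and plain gradient descent (all illustrative choices):

```python
import numpy as np

# Sketch: drive f(x) toward f*(x) by gradient descent on a squared
# error. f* here is x**2 on [-1, 1]; the network is one tanh hidden
# layer. All sizes and the learning rate are illustrative choices.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (256, 1))      # training inputs
Y = X ** 2                            # targets from f*(x) = x^2

W1, b1 = rng.normal(0, 0.5, (1, 16)), np.zeros(16)
W2, b2 = rng.normal(0, 0.5, (16, 1)), np.zeros(1)

lr = 0.1
for step in range(2000):
    H = np.tanh(X @ W1 + b1)          # hidden layer
    Y_hat = H @ W2 + b2               # network output f(x)
    err = Y_hat - Y                   # mismatch with f*(x)

    # Gradients of the mean squared error, by the chain rule.
    dY = 2 * err / len(X)
    dW2, db2 = H.T @ dY, dY.sum(0)
    dH = dY @ W2.T * (1 - H ** 2)     # tanh'(z) = 1 - tanh(z)^2
    dW1, db1 = X.T @ dH, dH.sum(0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(float(np.mean(err ** 2)))       # loss shrinks as f approaches f*
```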

3. Hidden Layers

The training examples specify directly what the output layer must do at each point \(x\): it must produce a value that is close to \(y\). The behavior of the other layers is not directly specified by the training data; the training data do not say what each individual layer should do. Instead, the learning algorithm must decide how to use these layers to best implement an approximation of \(f^*\). Because the training data do not show the desired output for each of these layers, they are called hidden layers.
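The following fragment makes this point explicit. The sizes and weights are arbitrary; the point is that the loss is written in terms of the output \(\hat{y}\) and the target \(y\), while the hidden representation \(h\) never appears in it:

```python
import numpy as np

# A training example specifies (x, y): the input and the desired
# output. It says nothing about the hidden layer, so h below has no
# target; training shapes it only indirectly, through the loss on
# the output. Sizes are illustrative.
rng = np.random.default_rng(0)
x, y = rng.random(4), 1.0             # the data: input and output target
W1, w2 = rng.normal(0, 0.5, (3, 4)), rng.normal(0, 0.5, 3)

h = np.maximum(0, W1 @ x)             # hidden representation: no target
y_hat = w2 @ h                        # output: compared against y
loss = (y_hat - y) ** 2               # the loss mentions y_hat, never h
print(loss)
```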